ABSTRACT

In recent years, universities have witnessed growing multiplicity and
diversity in fields of study, through increasing the number of
specialization years on the one hand and creating faculties for new
specializations on the other. Our purpose is to investigate students'
behavior in choosing their specialization (major) during their Bachelor
studies. In this research, we study the relationship between students'
inclinations and their academic achievement in the chosen department in
the School of Economics at Aleppo University using Bayesian networks. Our
findings show that there is no robust relationship between students'
inclinations and their educational achievements, from which we conclude
that students do not choose their major based on their knowledge and
skills. However, students' scores in accounting subjects before
specialization played the main role in their decision.
1. INTRODUCTION

In recent years, universities have witnessed an increasing
number of specialties and a growing diversity of study fields,
through increasing the years of specialization on the one hand
and creating schools for new specializations on the other.


The School of Economics is facing an unequal distribution of
students among the specializations. Most students choose a
few specific specializations such as banking, management,
and accounting, while other specializations are neglected,
or even closed, because of an inadequate number of
students. Our aim is to investigate students' behavior in
choosing their specialization during their Bachelor studies
in the School of Economics at Aleppo University. The
importance of this research comes from providing
decision-makers with information to determine the impact
of first- and second-year scores on students' choice of
specialization.


To understand, explain, and predict the behavior of
humans and consumers in the workplace, we frequently
apply probability theory to model the reality of the work.
These models allow managers and analysts to run "what
if" scenarios and manipulate variables in order to better
utilize resources and influence human behavior.


Bayesian networks are a widely used class of probabilistic
graphical models that are used in various tasks for
probabilistic reasoning and causal modeling. A Bayesian
network consists of two components: a directed acyclic
graph (DAG) that expresses the conditional independence
relations between random variables, and conditional
probability distributions associated with each variable.
Nodes of the DAG correspond to variables, and edges
express the dependencies between variables.


The structure of a Bayesian network, that is, the DAG, can be
easily visualized and may uncover important
characteristics of the domain, especially if the arcs are
interpreted as causal, in other words as direct
cause-effect relations. Whether this interpretation holds
depends on the approach used to learn the Bayesian network.


In the simplest case, a Bayesian network is specified by an
expert and is then used to perform inference. In other
applications, the task of defining the network is too
complex for humans. In this case, the network structure
and the parameters should be learned from data. There
are two broad classes of algorithms for automatically
learning the graph structure of a Bayesian network from
data: score-based learning and constraint-based learning.


In this study, we used Bayesian networks to analyze the
relationship between students' inclinations and the choice
of study specialization, using a genetic algorithm
with a Bayesian score as the method to build our network.


2. Literature Review

2.1. Students’ Choice of Specializations 

The focus of the marketization concept is student choice
behavior. There is growing research interest in how
students, as consumers, make their choices in higher
education (Alexander et al., 2011). Extensive research has
been conducted on students' decision making regarding
their programs and universities (Moogan & Baron, 2003;
Vrontis, Thrassou, & Melanthiou, 2007).


Many attributes play a role in student decision making,
but some are more important than others. Moogan and
Baron (2003) divided the variables that influence student
choices into two categories. The first category concerns
the characteristics of universities and programs. The
second category groups variables regarding the influencers
of choice, such as the sources of information and the
influences on students' decision making (Moogan & Baron,
2003). Some of the most important attributes that
influence students' decision making seem to be personal
interest in the program, the labor market, and location
(Van Deuren & Santema, 2012). The most important
characteristic taken into account in decision making seems
to be personal interest in the program (Owen & Jensen,
2004). Students also report personal interest in the
subject taught as very important in the selection process
(Lapan, Shaughnessy, & Boggs, 1996; Maringe, 2006; Noble
Calkins & Welki, 2006). Second in importance seem to be
the variables related to the labor market: expected
earnings, employability, and career opportunities (Van
Deuren & Santema, 2012), variables that were found to be
important in many consumer behavior studies (Malgwi,
Howe, & Burnaby, 2005; Maringe, 2006; Noble Calkins &
Welki, 2006). Another important aspect seems to be the
location of the university (Van Deuren & Santema, 2012).
Other important attributes for students' decision making
regarding a major or a bachelor's program are reputation
(Maringe, 2006; Moogan & Baron, 2003; Moogan, Baron, &
Bainbridge, 2001; Van Deuren & Santema, 2012) and
educational characteristics such as study materials and
practical assignments (Owen & Jensen, 2004). According to
Worthington and Higgs (2004), students' decisions are
based on two main factors: the location of an institution
and the reputation of a course.


In this paper, we grouped the students' scores for all the
subjects in the first and second academic years according
to field of study, and then studied the effect of these
groups on the choice of specialization. Moreover, we
tried to discover other hidden factors that affect
the students' decision.


2.2. Bayesian networks 

A network is a diagram consisting of a set of nodes
(vertices) and a set of edges that connect the nodes. If all
the edges in the graph are oriented, we obtain a directed
graph (Greenland & Pearl, 2014).


The name "Bayesian networks" derives from the
conditional probability rule known as Bayes' rule:

$$p(h \mid e) = \frac{p(e \mid h)\, p(h)}{p(e)} \qquad (1)$$

In other words:

$$\text{Posterior} = \frac{\text{Likelihood} \times \text{Prior}}{p(\text{evidence})} \qquad (2)$$
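As a minimal sketch of Bayes' rule for a binary hypothesis, the following Python function computes the posterior from a prior and two likelihoods. The function name, its parameters, and the numbers are hypothetical and chosen only for illustration; they are not taken from the study's data.

```python
def posterior(prior_h, likelihood_e_given_h, likelihood_e_given_not_h):
    """Return p(h | e) via Bayes' rule for a binary hypothesis h."""
    # p(e) by the law of total probability over h and not-h
    evidence = (likelihood_e_given_h * prior_h
                + likelihood_e_given_not_h * (1.0 - prior_h))
    return likelihood_e_given_h * prior_h / evidence


# Hypothetical numbers, for illustration only
p = posterior(prior_h=0.3, likelihood_e_given_h=0.8, likelihood_e_given_not_h=0.2)
print(round(p, 4))
```

Here the evidence term is expanded over the two possible states of the hypothesis, so the posterior is the share of the evidence probability that is attributable to h.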


A Bayesian network is defined as a directed acyclic graph
(DAG) in which each node is conditionally independent of
its non-descendants given its parents (the Markov condition)
(Bojduj, 2009).


In general, a Bayesian network consists of the following
components:

> A directed acyclic graph G = G(V, E), where V
represents the set of nodes and E is the set of edges in
the graph G.

> A probability space (Ω, p), where Ω represents the sample
space and p the probabilities associated with each event.

> V = {V_1, V_2, ..., V_n}, a set of observable random
variables, discrete or continuous, associated with the
graph nodes.


The random variables are distributed in the probability space
(Ω, p), where:

$$P(V_1, V_2, \ldots, V_n) = \prod_{i=1}^{n} p(V_i \mid pa(V_i)) \qquad (3)$$

where pa(V_i) denotes the parents of the node V_i.
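The factorization of the joint distribution over the parents of each node can be illustrated with a small sketch. The three-node network, the variable names, and all probability values below are hypothetical assumptions used only to show the mechanics.

```python
from itertools import product

# Hypothetical three-node network: C -> A and C -> B (all variables binary).
parents = {"C": (), "A": ("C",), "B": ("C",)}

# cpt[node][parent_values][value] = p(node = value | parents = parent_values)
cpt = {
    "C": {(): {0: 0.6, 1: 0.4}},
    "A": {(0,): {0: 0.7, 1: 0.3}, (1,): {0: 0.2, 1: 0.8}},
    "B": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.5, 1: 0.5}},
}


def joint(assignment):
    """Joint probability as the product of p(V_i | pa(V_i)) over all nodes."""
    prob = 1.0
    for node, pa in parents.items():
        pa_values = tuple(assignment[p] for p in pa)
        prob *= cpt[node][pa_values][assignment[node]]
    return prob


# Summing the joint over every assignment should give 1.
total = sum(joint(dict(zip("CAB", values))) for values in product((0, 1), repeat=3))
print(joint({"C": 1, "A": 1, "B": 0}), total)
```

Storing one conditional table per node, indexed by the parent configuration, mirrors the way the network itself is specified: the graph gives the parent sets and the tables give the local distributions.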


Bayesian networks differ from Markov networks in that
Markov networks have undirected edges. In Markov
networks, we cannot distinguish between cause and
effect, whereas in Bayesian networks the directed edges
determine the affecting and the affected nodes; for this
reason they are sometimes called "Causal Bayesian
Networks". The network can be represented as follows:
(V, G, P(V_i | Pa(V_i))) (Pearl, 2009).


The basic property of Bayesian networks is that each edge
Pa(V_i) → V_i represents a causal relation and is associated
with the conditional probability values p(V_i | pa(V_i))
(Lauritzen, 2001).


In Bayesian networks, an edge shows the direct effect of
the parent variable on the child variable. Thus, the model
assumes that there are no intermediate variables between
the parent variable and the child variable that mediate
this effect.


The Conditional Independence:

If we have three nodes A, B, and C, we say that nodes A
and B are independent given C if and only if our
knowledge about A does not change when our knowledge
about B changes and the value of C is known. That means:
P(A|C,B) = P(A|C). In addition, we can compute the joint
probability as follows:

P(A,C,B) = P(C) P(B|C) P(A|B,C) = P(C) P(B|C) P(A|C) (4)


Here we have three cases:
1. A is conditionally independent of B given C:

P(B|C,A) = P(B|C)

2. A is conditionally independent of B given C:

P(B|C,A) = P(B|C), P(A|C,B) = P(A|C)

3. A is conditionally dependent on B given C:

P(B|C,A) ≠ P(B|C)
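The conditional independence statement P(A|C,B) = P(A|C) can be checked numerically. The sketch below assumes a hypothetical structure in which C is the parent of both A and B, with made-up binary distributions; it builds the joint from P(C) P(B|C) P(A|C) and then compares the two conditionals.

```python
from itertools import product

# Hypothetical binary distributions; C is assumed to be the parent of A and B.
p_c = {0: 0.6, 1: 0.4}
p_a_given_c = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # p_a_given_c[c][a]
p_b_given_c = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}  # p_b_given_c[c][b]

# Joint distribution built as P(C) P(B|C) P(A|C), keyed by (a, b, c)
joint = {
    (a, b, c): p_c[c] * p_b_given_c[c][b] * p_a_given_c[c][a]
    for a, b, c in product((0, 1), repeat=3)
}


def cond_a(a, c, b=None):
    """P(A=a | C=c) when b is None, otherwise P(A=a | C=c, B=b)."""
    bs = (0, 1) if b is None else (b,)
    num = sum(joint[(a, bb, c)] for bb in bs)
    den = sum(joint[(aa, bb, c)] for aa in (0, 1) for bb in bs)
    return num / den


# A and B are independent given C if adding B never changes P(A | C).
independent = all(
    abs(cond_a(a, c, b) - cond_a(a, c)) < 1e-12
    for a, b, c in product((0, 1), repeat=3)
)
print(independent)
```

With a joint that does not factor in this way, the same check would report a difference for at least one combination of values, which corresponds to the dependent case.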


3. Data and Variables 

In this study, we investigate students' behavior in
choosing their specialization during their Bachelor
studies in the Faculty of Economics at Aleppo University.
After students pass the first and second years of study,
they have to choose one specialization in which to
continue their studies. Our dataset contains 1740 records
about the students in the Faculty of Economics at Aleppo
University. The variables are summarized in Table 1.


Variables:

The variables represent the student's score in every subject of
the first and second academic years. We classified these
subjects into groups as follows:

> Economics.
> Accounting.
> Management.
> Management Information Systems (MIS).
> Statistics.
> English language.


Table 1. The variables of the study

Variable | Description | Values of variables
Subject-group scores (e.g., Statistics) | The scores that the student achieved in every group of subjects in the first and second years, before choosing the specialization | Not passed (Fail), Acceptable, Good, Very good, Excellent, Honors
Graduation | The student's graduation average in the department that he or she chose |
Department | The department that is chosen |


We note that most variables represent the main subjects' scores in the first and second years; "graduation", however,
represents the student's average score across all courses in the third and fourth years.





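With regard to the Department variable, Table 2 summarizes the number of students in each department.

Table 2. Summary of departments

Department No | Department Name | Graduated | Not graduated | Total
1 | Economics | [illegible] | [illegible] | [illegible]
2 | Accounting | [illegible] | [illegible] | [illegible]
3 | Management | [illegible] | [illegible] | [illegible]
4 | Marketing | [illegible] | [illegible] | [illegible]
5 | Statistics and Information Systems | [illegible] | [illegible] | [illegible]
 | Sum | 1348 | [illegible] | 1751


4. Methodology
Learning the Structure of Bayesian Networks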


Structure learning aims to discover the causal relationships between the variables, i.e., to identify the edges between the
nodes and to direct them. The general principle is that we draw an arc between nodes that have a causal relation. Generally,
there are two main ways to learn the network structure (Margaritis, 2003):


1. A score-and-search approach through the space of Bayesian network structures.
For the data set D and network B, with P(B) the prior distribution of network B, the posterior probability of network B is
calculated as follows (Cooper & Herskovits, 1992; Yang & Chang, 2002):

$$P(B \mid D) = \frac{P(B, D)}{P(D)} = \frac{P(B)\, P(D \mid B)}{\sum_{B'} P(B')\, P(D \mid B')} \qquad (5)$$


2. A constraint-based approach that uses conditional independencies identified in the data.
When we test the independence between two variables, the χ² value is calculated as follows (Spirtes,
Glymour, & Scheines, 2000):

$$\chi^2 = \sum_{i} \sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \qquad (6)$$

where O_ij and E_ij are the observed and expected counts in cell (i, j) of the contingency table.
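As an illustration of the independence test used by constraint-based methods, the sketch below computes the Pearson χ² statistic for a two-way table of counts. The function name and the counts are hypothetical, and the sketch assumes that no row or column total is zero.

```python
def chi_square(observed):
    """Pearson chi-square statistic for a two-way table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o_ij in enumerate(row):
            # Expected count in cell (i, j) if the two variables were independent
            e_ij = row_totals[i] * col_totals[j] / n
            stat += (o_ij - e_ij) ** 2 / e_ij
    return stat


# Hypothetical counts, for illustration only
table = [[30, 10], [20, 40]]
stat = chi_square(table)
dof = (len(table) - 1) * (len(table[0]) - 1)  # degrees of freedom
print(round(stat, 4), dof)
```

The statistic is then compared with a χ² distribution with the computed degrees of freedom; a large value suggests that the two variables are not independent, so an edge between them is retained.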
Because searching for the best network structure is an NP-hard problem (Robinson, 1977), a genetic algorithm is often
used to search for a network structure (Perry, 2003).
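Any search procedure of this kind needs a score for each candidate structure. A minimal sketch of one such score is given below: the log of the closed-form P(D|B) of Cooper & Herskovits (1992), often called the K2 metric, which assumes complete discrete data and uniform priors over the parameters. Whether the Bayesian score used in this study is exactly this metric is not stated here, so this is an assumption; the function names, the variable names "Acc" and "Dept", and the data are hypothetical.

```python
from collections import Counter
from math import lgamma


def log_k2_node(data, child, parents, arity):
    """Log K2 contribution of one node given its parent set.

    data  : list of dicts mapping variable name -> integer state
    arity : dict mapping variable name -> number of states
    """
    r = arity[child]
    n_ijk = Counter()  # counts per (parent configuration, child state)
    n_ij = Counter()   # counts per parent configuration
    for row in data:
        cfg = tuple(row[p] for p in parents)
        n_ijk[(cfg, row[child])] += 1
        n_ij[cfg] += 1
    score = 0.0
    for n in n_ij.values():
        score += lgamma(r) - lgamma(n + r)  # log[(r - 1)! / (N_ij + r - 1)!]
    for n in n_ijk.values():
        score += lgamma(n + 1)              # log(N_ijk!)
    return score


def log_k2(data, structure, arity):
    """Sum the node contributions; structure maps each node to its parents."""
    return sum(log_k2_node(data, child, pa, arity) for child, pa in structure.items())


# Hypothetical data in which Dept always equals Acc
data = [{"Acc": a, "Dept": a} for a in (0, 1)] * 10
arity = {"Acc": 2, "Dept": 2}
with_edge = log_k2(data, {"Acc": (), "Dept": ("Acc",)}, arity)
no_edge = log_k2(data, {"Acc": (), "Dept": ()}, arity)
print(with_edge > no_edge)
```

Because the score decomposes into one term per node, a search algorithm only has to recompute the terms of the nodes whose parent sets change when it modifies a candidate structure.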





5. Results and Discussion
We learned a Bayesian network structure from our data using the Weka software. We chose a genetic algorithm with a
Bayesian score as the method to build our network, and obtained the network shown in Fig. 1.


Fig 1 Student scores and Specializations network 
In addition, Table 3 gives more information about the strength of the relations between the Bayesian network nodes.


Table 3. Strength of relations between Bayesian network nodes

Node | Mutual Information | Symmetric | For X Given Y | For Y Given X | Entropy X | Entropy Y
 | 0.0258 | 0.0211 | 0.0245 | 0.0185 | 1.0405 | 1.3818
 | 0.0411 | 0.0356 | 0.0363 | 0.0349 | 1.1324 | 1.1789
 | 0.0383 | 0.0352 | 0.0368 | 0.0338 | 1.0405 | 1.1324
Accounting | 0.0274 | 0.0301 | 0.0351 | 0.0263 | 0.7808 | 1.0405
 | 0.0459 | 0.0511 | 0.0588 | 0.0452 | 0.7808 | 1.0152
 | 0.0279 | 0.0315 | 0.0357 | 0.0282 | 0.7808 | 0.9882
English | 0.0278 | 0.0255 | 0.0245 | 0.0265 | 1.1324 | 1.0502
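The column definitions of Table 3 are not stated in the text. One reading that appears consistent with the reported values is that the mutual information is normalized by the node entropies: divided by the entropy of X, by the entropy of Y, and by their mean for the symmetric measure. The sketch below applies this assumed reading to the second row of Table 3; the function and key names are hypothetical.

```python
def normalized_measures(mutual_info, entropy_x, entropy_y):
    """Normalize mutual information by the entropies of the two nodes.

    Assumed definitions, not confirmed by the source:
      symmetric     = 2 * I(X;Y) / (H(X) + H(Y))
      for_x_given_y = I(X;Y) / H(X)
      for_y_given_x = I(X;Y) / H(Y)
    """
    return {
        "symmetric": 2.0 * mutual_info / (entropy_x + entropy_y),
        "for_x_given_y": mutual_info / entropy_x,
        "for_y_given_x": mutual_info / entropy_y,
    }


# Values taken from the second row of Table 3
m = normalized_measures(mutual_info=0.0411, entropy_x=1.1324, entropy_y=1.1789)
print({name: round(value, 4) for name, value in m.items()})
```

Under this reading, each value is the fraction of a node's uncertainty that is explained by the other node, so values of a few percent indicate relations that are detectable but weak in absolute terms. Some other rows agree with these formulas only approximately, which may reflect rounding or extraction errors in the table.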





From Fig. 1 and Table 3, we can read the results as follows:


First, the most important finding is that students' scores in Accounting before specialization played the main role in
students' choice of department, with a strong relation.


This means that once the Accounting score of a student is known, this value alone contributes to determining the value
of Department:


P(Dept. | Acc, Eco) = P(Dept. | Acc).

Second, students' scores in Math and Statistics before specialization (x7) can help us predict the student's average in the
department that he or she chose (x12). Third, students' scores in Economics (x1) are the most independent of the
other variables.
Fig 3 Mesh query of the Statistics variable and the graduation variable


In addition, if we run inference using the software, we can see:
Fig 4 An example of inference


We can read Fig. 4 as follows: of the students who chose Economics as their department, 15.9%, 25.3%, 19.7% and 22.5%
achieved a Very Good score in Economics, Accounting, Management and MIS subjects, respectively. This shows that
students did not choose the Economics department based on their skills.


In this research, we focus on how students' skills can unlock their innovative potential. For example, in Fig. 5 we
can see that when students choose the Economics department based on their skills, the percentage with an excellent
graduation average increases from 7.3% to 14.4% and the percentage failing decreases from 22% to 17.5%. The
Bayesian network could show this despite the existence of many unobserved latent variables.


1 A mesh query generates a two-dimensional graph by repeatedly querying a network while varying the values of two variables.
Fig 5 An example of inference


6. Conclusions 

We note from the results that students' decisions did not
depend on their orientation; rather, their accounting
skills played the greater role in selecting their majors. If
we take into account that job opportunities for accounting
students exceed those for other students, these results
indicate that market requirements significantly
affect students' decisions.


Therefore, we recommend that the decision-makers at the
Faculty of Economics organize workshops to draw
students' attention to the importance of following their
interests, developing their knowledge and skills, and
improving their talents in order to achieve greater
creativity and innovation.